Although the U-Net architecture has been widely used to segment medical images, we address two of its shortcomings in this work. First, the accuracy of the vanilla U-Net degrades when the shapes and sizes of the target regions to be segmented vary significantly. Even though U-Net already has the capability to analyze features at multiple scales, we propose explicitly adding multi-scale feature maps to each convolutional module of the U-Net encoder to improve the segmentation of histology images. Second, the accuracy of a U-Net model also suffers when the annotations used for supervised learning are noisy or incomplete. This can happen because of the inherent difficulty for a human expert to precisely and accurately identify and delineate all instances of a specific pathology. We address this challenge by introducing an auxiliary confidence map that places less emphasis on the boundaries of the given target regions. Furthermore, we exploit the bootstrapping property of deep networks to intelligently address the missing-annotation problem. In our experiments on a private dataset of breast cancer lymph nodes, in which the primary task was the segmentation of germinal centers and sinus histiocytosis, we observed significant improvements over the U-Net baseline from the two proposed augmentations.
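The abstract does not specify the exact block design, so the following is only a minimal sketch of one plausible multi-scale encoder block: parallel dilated 3x3 convolutions whose outputs are concatenated and fused. The class name `MultiScaleConvBlock`, the dilation rates, and the channel counts are illustrative assumptions, not details from the paper.

```python
# Hypothetical sketch (not the paper's exact design): a U-Net encoder
# block that computes feature maps at several receptive fields in parallel.
import torch
import torch.nn as nn

class MultiScaleConvBlock(nn.Module):
    """One encoder stage that extracts features at multiple scales."""

    def __init__(self, in_ch: int, out_ch: int, dilations=(1, 2, 4)):
        super().__init__()
        # One 3x3 branch per dilation rate; padding=d keeps spatial size fixed.
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=d, dilation=d),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
            )
            for d in dilations
        )
        # Fuse the concatenated multi-scale maps back down to out_ch channels.
        self.fuse = nn.Conv2d(out_ch * len(dilations), out_ch, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))

# Usage: swap a vanilla encoder block for the multi-scale variant.
block = MultiScaleConvBlock(in_ch=64, out_ch=128)
feats = block(torch.randn(1, 64, 256, 256))  # -> shape (1, 128, 256, 256)
```

Concatenating branches with different dilation rates is one standard way to expose multiple scales within a single stage; the paper's actual mechanism may differ.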
New technologies and the availability of geospatial data have drawn attention to spatio-temporal biases present in society. For example: the COVID-19 pandemic highlighted disparities in the availability of broadband service and its role in the digital divide; the environmental justice movement in the United States has raised awareness of the health implications for minority populations stemming from historical redlining practices; and studies have found varying quality and coverage in the collection and sharing of open-source geospatial data. Despite the extensive literature on machine learning (ML) fairness, few algorithmic strategies have been proposed to mitigate such biases. In this paper we highlight the unique challenges of quantifying and addressing spatio-temporal biases, through the lens of use cases presented in the scientific literature and media. We envision a roadmap of ML strategies that need to be developed or adapted to quantify and overcome these challenges -- including transfer learning, active learning, and reinforcement learning techniques. Further, we discuss the potential role of ML in providing guidance to policy makers on issues related to spatial fairness.
There is an increasing interest in developing artificial intelligence (AI) systems to process and interpret electronic health records (EHRs). Natural language processing (NLP) powered by pretrained language models is the key technology for medical AI systems utilizing clinical narratives. However, few clinical language models exist, and the largest one trained in the clinical domain is comparatively small at 110 million parameters (compared with billions of parameters in the general domain). It is not clear how large clinical language models with billions of parameters can help medical AI systems utilize unstructured EHRs. In this study, we develop from scratch a large clinical language model - GatorTron - using >90 billion words of text (including >82 billion words of de-identified clinical text) and systematically evaluate it on 5 clinical NLP tasks: clinical concept extraction, medical relation extraction, semantic textual similarity, natural language inference (NLI), and medical question answering (MQA). We examine how (1) scaling up the number of parameters and (2) scaling up the size of the training data could benefit these NLP tasks. GatorTron models scale up the clinical language model from 110 million to 8.9 billion parameters and improve all 5 clinical NLP tasks (e.g., 9.6% and 9.5% improvement in accuracy for NLI and MQA), which can be applied to medical AI systems to improve healthcare delivery. The GatorTron models are publicly available at: https://catalog.ngc.nvidia.com/orgs/nvidia/teams/clara/models/gatortron_og.
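As a hedged usage sketch only: the abstract points to the NGC catalog for the released weights, and the snippet below assumes a checkpoint converted to a Hugging Face-compatible format (the placeholder model ID is not an official one). It shows how such a model could embed a clinical note for downstream tasks like NLI or MQA.

```python
# Hypothetical sketch: embedding clinical text with a GatorTron checkpoint,
# assuming it has been converted to a Hugging Face-compatible format.
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "path/to/converted-gatortron"  # placeholder, not an official ID

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME)

note = "Patient denies chest pain; reports shortness of breath on exertion."
inputs = tokenizer(note, return_tensors="pt", truncation=True)
outputs = model(**inputs)

# Mean-pool the final hidden states into one vector per note, which could
# feed downstream clinical NLP tasks such as NLI or question answering.
embedding = outputs.last_hidden_state.mean(dim=1)
print(embedding.shape)  # e.g., torch.Size([1, hidden_size])
```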